Extending mixtures of multivariate t-factor analyzers

Authors

  • Jeffrey L. Andrews
  • Paul D. McNicholas
Abstract

Model-based clustering typically involves the development of a family of mixture models and the imposition of these models upon data. The best member of the family is then chosen using some criterion and the associated parameter estimates lead to predicted group memberships, or clusterings. This paper describes the extension of the mixtures of multivariate t-factor analyzers model to include constraints on the degrees of freedom, the factor loadings, and the error variance matrices. The result is a family of six mixture models, including parsimonious models. Parameter estimates for this family of models are derived using an alternating expectation-conditional maximization algorithm and convergence is determined based on Aitken’s acceleration. Model selection is carried out using the Bayesian information criterion (BIC) and the integrated completed likelihood (ICL). This novel family of mixture models is then applied to simulated and real data where clustering performance meets or exceeds that of established model-based clustering methods. The simulation studies include a comparison of the BIC and the ICL as model selection techniques for this novel family of models. Application to simulated data with larger dimensionality is also explored.
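The abstract names two generic ingredients, an Aitken-acceleration stopping rule and BIC-based model selection, without giving formulas. The sketch below illustrates one common form of each, under stated assumptions: the stopping rule compares the Aitken estimate of the limiting log-likelihood with the latest value, and the BIC uses the "larger is better" convention 2·loglik − (number of free parameters)·log(n). The function names, the tolerance `eps`, and the exact variant of the criterion are illustrative choices, not taken from the paper.

```python
import math


def aitken_converged(loglik, eps=1e-2):
    """Aitken-acceleration stopping rule (one common variant, assumed here).

    loglik: log-likelihood values from successive iterations, oldest first.
    Returns True when the estimated limiting log-likelihood is within eps
    of the most recent value.
    """
    if len(loglik) < 3:
        return False  # need three values to form the acceleration ratio
    l_prev, l_curr, l_next = loglik[-3], loglik[-2], loglik[-1]
    denom = l_curr - l_prev
    if denom == 0:
        return True  # no change between iterations
    a = (l_next - l_curr) / denom  # Aitken acceleration at this step
    if abs(1.0 - a) < 1e-12:
        return False  # asymptotic estimate undefined; keep iterating
    l_inf = l_curr + (l_next - l_curr) / (1.0 - a)  # estimated limit
    return abs(l_inf - l_next) < eps


def bic(loglik, n_params, n_obs):
    """BIC in the 'larger is better' convention (an assumption)."""
    return 2.0 * loglik - n_params * math.log(n_obs)


# Hypothetical log-likelihood trace that approaches -100 geometrically.
trace = [-100.0 - 10.0 * 0.5 ** k for k in range(13)]
early_stop = aitken_converged(trace[:3])
late_stop = aitken_converged(trace)
```

With this convention, candidate models would be fitted separately and the one with the largest `bic` value retained; the ICL adds an entropy penalty to the BIC and is not shown here.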

Similar resources

Mixtures of common t-factor analyzers for clustering high-dimensional microarray data

MOTIVATION Mixtures of factor analyzers enable model-based clustering to be undertaken for high-dimensional microarray data, where the number of observations n is small relative to the number of genes p. Moreover, when the number of clusters is not small, for example, where there are several different types of cancer, there may be the need to reduce further the number of parameters in the speci...

Maximum likelihood estimation in constrained parameter spaces for mixtures of factor analyzers

Mixtures of factor analyzers are becoming more and more popular in the area of model based clustering of multivariate data. According to the likelihood approach in data modeling, it is well known that the unconstrained likelihood function may present spurious maxima and singularities. To reduce such drawbacks, in this paper we introduce a procedure for parameter estimation of mixtures of factor...

Calibration of Infra-red CO(2) Gas Analyzers.

Precision gas mixing pumps produce CO(2) gas mixtures for the calibration of infra-red CO(2) gas analyzers equivalent in accuracy to the standard CO(2) gas mixtures (+/- 1%) supplied by the National Bureau of Standards, Washington, D.C. The calibration of infra-red gas analyzers by the pressure difference method and by concentration differences did not agree. A factor of 0.709 was necessary to ...

Mixtures of robust probabilistic principal component analyzers

Mixtures of probabilistic principal component analyzers model high-dimensional nonlinear data by combining local linear models. Each mixture component is specifically designed to extract the local principal orientations in the data. An important issue with this generative model is its sensitivity to data lying off the low-dimensional manifold. In order to address this problem, the mixtures of r...

Adaptive Mixtures of Factor Analyzers

A mixture of factor analyzers is a semi-parametric density estimator that generalizes the well-known mixtures of Gaussians model by allowing each Gaussian in the mixture to be represented in a different lower-dimensional manifold. This paper presents a robust and parsimonious model selection algorithm for training a mixture of factor analyzers, carrying out simultaneous clustering and locally l...

Journal:
  • Statistics and Computing

Volume 21

Publication date: 2011